Exploration server for computer science research in Lorraine

Warning: this site is under development.
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Island Grammar-Based Parsing Using GLL and Tom

Internal identifier: 001561 (Main/Exploration); previous: 001560; next: 001562

Authors: Ali Afroozeh [Netherlands]; Jean-Christophe Bach [France]; Mark Van Den Brand [Netherlands]; Adrian Johnstone [United Kingdom]; Maarten Manders [Netherlands]; Pierre-Etienne Moreau [France]; Elizabeth Scott [United Kingdom]

Source: Lecture Notes in Computer Science (ISSN 0302-9743), 2013

RBID : ISTEX:E0F72B61DB77182F93E495438C84CA6C8C51FC1C

Abstract

Extending a language by embedding within it another language presents significant parsing challenges, especially if the embedding is recursive. The composite grammar is likely to be nondeterministic as a result of tokens that are valid in both the host and the embedded language. In this paper we examine the challenges of embedding the Tom language into a variety of general-purpose high-level languages. Tom provides syntax and semantics for advanced pattern matching and tree rewriting facilities. Embedded Tom constructs are translated into the host language by a preprocessor, the output of which is a composite program written purely in the host language. Tom implementations exist for Java, C, C#, Python and Caml. The current parser is complex and difficult to maintain. In this paper, we describe how Tom can be parsed using island grammars implemented with the Generalised LL (GLL) parsing algorithm. The grammar is, as might be expected, ambiguous. Extracting the correct derivation relies on our disambiguation strategy, which is based on pattern matching within the parse forest. We describe different classes of ambiguity and propose patterns for resolving them.
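
The island-grammar idea mentioned in the abstract can be illustrated with a rough sketch: embedded constructs ("islands") are picked out of the surrounding host code ("water"), which is otherwise left uninterpreted. The Python sketch below is not the GLL-based parser described in the paper. It assumes, for illustration only, that an island starts at a `%match` keyword and ends at the brace closing its first block. The function name `split_islands` is hypothetical, braces are counted naively, and neither recursive embedding nor ambiguity is handled.

```python
import re
from typing import List, Tuple

# Assumed island marker: a Tom-style "%match" keyword. Other embedded
# constructs are deliberately left out of this sketch.
ISLAND_START = re.compile(r"%match\b")


def split_islands(source: str) -> List[Tuple[str, str]]:
    """Split source into ("water", text) and ("island", text) chunks.

    An island runs from "%match" to the brace that closes its first "{".
    Braces are counted naively: string literals and comments are ignored.
    """
    chunks: List[Tuple[str, str]] = []
    pos = 0
    while True:
        m = ISLAND_START.search(source, pos)
        if m is None:
            break
        open_brace = source.find("{", m.end())
        if open_brace == -1:
            break  # no block follows: the rest stays water
        depth = 0
        end = None
        for i in range(open_brace, len(source)):
            if source[i] == "{":
                depth += 1
            elif source[i] == "}":
                depth -= 1
                if depth == 0:
                    end = i + 1
                    break
        if end is None:
            break  # unbalanced braces: the rest stays water
        if m.start() > pos:
            chunks.append(("water", source[pos:m.start()]))
        chunks.append(("island", source[m.start():end]))
        pos = end
    if pos < len(source):
        chunks.append(("water", source[pos:]))
    return chunks


if __name__ == "__main__":
    demo = "int f() { %match(x) { a() -> { return 1; } } return 0; }"
    for kind, text in split_islands(demo):
        print(kind, repr(text))
```

A scanner of this kind makes a single, fixed choice at every `%match`. A generalised parser such as GLL instead keeps all derivations of the composite grammar, which is why the abstract pairs it with a separate disambiguation step over the parse forest.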

Url: https://api.istex.fr/ark:/67375/HCB-XB1KR35C-L/fulltext.pdf
DOI: 10.1007/978-3-642-36089-3_13


The document in XML format

<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Island Grammar-Based Parsing Using GLL and Tom</title>
<author>
<name sortKey="Afroozeh, Ali" sort="Afroozeh, Ali" uniqKey="Afroozeh A" first="Ali" last="Afroozeh">Ali Afroozeh</name>
</author>
<author>
<name sortKey="Bach, Jean Christophe" sort="Bach, Jean Christophe" uniqKey="Bach J" first="Jean-Christophe" last="Bach">Jean-Christophe Bach</name>
</author>
<author>
<name sortKey="Van Den Brand, Mark" sort="Van Den Brand, Mark" uniqKey="Van Den Brand M" first="Mark" last="Van Den Brand">Mark Van Den Brand</name>
</author>
<author>
<name sortKey="Johnstone, Adrian" sort="Johnstone, Adrian" uniqKey="Johnstone A" first="Adrian" last="Johnstone">Adrian Johnstone</name>
</author>
<author>
<name sortKey="Manders, Maarten" sort="Manders, Maarten" uniqKey="Manders M" first="Maarten" last="Manders">Maarten Manders</name>
</author>
<author>
<name sortKey="Moreau, Pierre Etienne" sort="Moreau, Pierre Etienne" uniqKey="Moreau P" first="Pierre-Etienne" last="Moreau">Pierre-Etienne Moreau</name>
</author>
<author>
<name sortKey="Scott, Elizabeth" sort="Scott, Elizabeth" uniqKey="Scott E" first="Elizabeth" last="Scott">Elizabeth Scott</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:E0F72B61DB77182F93E495438C84CA6C8C51FC1C</idno>
<date when="2013" year="2013">2013</date>
<idno type="doi">10.1007/978-3-642-36089-3_13</idno>
<idno type="url">https://api.istex.fr/ark:/67375/HCB-XB1KR35C-L/fulltext.pdf</idno>
<idno type="wicri:Area/Istex/Corpus">003564</idno>
<idno type="wicri:explorRef" wicri:stream="Istex" wicri:step="Corpus" wicri:corpus="ISTEX">003564</idno>
<idno type="wicri:Area/Istex/Curation">003522</idno>
<idno type="wicri:Area/Istex/Checkpoint">000187</idno>
<idno type="wicri:explorRef" wicri:stream="Istex" wicri:step="Checkpoint">000187</idno>
<idno type="wicri:doubleKey">0302-9743:2013:Afroozeh A:island:grammar:based</idno>
<idno type="wicri:Area/Main/Merge">001573</idno>
<idno type="wicri:Area/Main/Curation">001561</idno>
<idno type="wicri:Area/Main/Exploration">001561</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title level="a" type="main" xml:lang="en">Island Grammar-Based Parsing Using GLL and Tom</title>
<author>
<name sortKey="Afroozeh, Ali" sort="Afroozeh, Ali" uniqKey="Afroozeh A" first="Ali" last="Afroozeh">Ali Afroozeh</name>
<affiliation wicri:level="1">
<country xml:lang="fr">Pays-Bas</country>
<wicri:regionArea>Eindhoven University of Technology, NL-5612 AZ, Eindhoven</wicri:regionArea>
<wicri:noRegion>Eindhoven</wicri:noRegion>
</affiliation>
<affiliation></affiliation>
</author>
<author>
<name sortKey="Bach, Jean Christophe" sort="Bach, Jean Christophe" uniqKey="Bach J" first="Jean-Christophe" last="Bach">Jean-Christophe Bach</name>
<affiliation wicri:level="3">
<country xml:lang="fr">France</country>
<wicri:regionArea>Inria, F-54600, Villers-lès-Nancy</wicri:regionArea>
<placeName>
<region type="region" nuts="2">Grand Est</region>
<region type="old region" nuts="2">Lorraine (région)</region>
<settlement type="city">Villers-lès-Nancy</settlement>
</placeName>
</affiliation>
<affiliation wicri:level="4">
<country xml:lang="fr">France</country>
<wicri:regionArea>LORIA, UMR 7503, Université de Lorraine, F-54500, Vandœuvre-lès-Nancy</wicri:regionArea>
<placeName>
<region type="region" nuts="2">Grand Est</region>
<region type="old region" nuts="2">Lorraine (région)</region>
<settlement type="city">Vandœuvre-lès-Nancy</settlement>
</placeName>
<orgName type="university">Université de Lorraine</orgName>
</affiliation>
<affiliation wicri:level="3">
<country xml:lang="fr">France</country>
<wicri:regionArea>LORIA, UMR 7503, CNRS, F-54500, Vandœuvre-lès-Nancy</wicri:regionArea>
<placeName>
<region type="region" nuts="2">Grand Est</region>
<region type="old region" nuts="2">Lorraine (région)</region>
<settlement type="city">Vandœuvre-lès-Nancy</settlement>
</placeName>
</affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">France</country>
</affiliation>
</author>
<author>
<name sortKey="Van Den Brand, Mark" sort="Van Den Brand, Mark" uniqKey="Van Den Brand M" first="Mark" last="Van Den Brand">Mark Van Den Brand</name>
<affiliation wicri:level="1">
<country xml:lang="fr">Pays-Bas</country>
<wicri:regionArea>Eindhoven University of Technology, NL-5612 AZ, Eindhoven</wicri:regionArea>
<wicri:noRegion>Eindhoven</wicri:noRegion>
</affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">Pays-Bas</country>
</affiliation>
</author>
<author>
<name sortKey="Johnstone, Adrian" sort="Johnstone, Adrian" uniqKey="Johnstone A" first="Adrian" last="Johnstone">Adrian Johnstone</name>
<affiliation wicri:level="4">
<country xml:lang="fr">Royaume-Uni</country>
<wicri:regionArea>Royal Holloway, University of London, TW20 0EX, Surrey, Egham</wicri:regionArea>
<orgName type="university">Université de Londres</orgName>
<placeName>
<settlement type="city">Londres</settlement>
<region type="country">Angleterre</region>
<region type="région" nuts="1">Grand Londres</region>
</placeName>
</affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">Royaume-Uni</country>
</affiliation>
</author>
<author>
<name sortKey="Manders, Maarten" sort="Manders, Maarten" uniqKey="Manders M" first="Maarten" last="Manders">Maarten Manders</name>
<affiliation wicri:level="1">
<country xml:lang="fr">Pays-Bas</country>
<wicri:regionArea>Eindhoven University of Technology, NL-5612 AZ, Eindhoven</wicri:regionArea>
<wicri:noRegion>Eindhoven</wicri:noRegion>
</affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">Pays-Bas</country>
</affiliation>
</author>
<author>
<name sortKey="Moreau, Pierre Etienne" sort="Moreau, Pierre Etienne" uniqKey="Moreau P" first="Pierre-Etienne" last="Moreau">Pierre-Etienne Moreau</name>
<affiliation wicri:level="3">
<country xml:lang="fr">France</country>
<wicri:regionArea>Inria, F-54600, Villers-lès-Nancy</wicri:regionArea>
<placeName>
<region type="region" nuts="2">Grand Est</region>
<region type="old region" nuts="2">Lorraine (région)</region>
<settlement type="city">Villers-lès-Nancy</settlement>
</placeName>
</affiliation>
<affiliation wicri:level="4">
<country xml:lang="fr">France</country>
<wicri:regionArea>LORIA, UMR 7503, Université de Lorraine, F-54500, Vandœuvre-lès-Nancy</wicri:regionArea>
<placeName>
<region type="region" nuts="2">Grand Est</region>
<region type="old region" nuts="2">Lorraine (région)</region>
<settlement type="city">Vandœuvre-lès-Nancy</settlement>
</placeName>
<orgName type="university">Université de Lorraine</orgName>
</affiliation>
<affiliation wicri:level="3">
<country xml:lang="fr">France</country>
<wicri:regionArea>LORIA, UMR 7503, CNRS, F-54500, Vandœuvre-lès-Nancy</wicri:regionArea>
<placeName>
<region type="region" nuts="2">Grand Est</region>
<region type="old region" nuts="2">Lorraine (région)</region>
<settlement type="city">Vandœuvre-lès-Nancy</settlement>
</placeName>
</affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">France</country>
</affiliation>
</author>
<author>
<name sortKey="Scott, Elizabeth" sort="Scott, Elizabeth" uniqKey="Scott E" first="Elizabeth" last="Scott">Elizabeth Scott</name>
<affiliation wicri:level="4">
<country xml:lang="fr">Royaume-Uni</country>
<wicri:regionArea>Royal Holloway, University of London, TW20 0EX, Surrey, Egham</wicri:regionArea>
<orgName type="university">Université de Londres</orgName>
<placeName>
<settlement type="city">Londres</settlement>
<region type="country">Angleterre</region>
<region type="région" nuts="1">Grand Londres</region>
</placeName>
</affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">Royaume-Uni</country>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series>
<title level="s" type="main" xml:lang="en">Lecture Notes in Computer Science</title>
<idno type="ISSN">0302-9743</idno>
<idno type="eISSN">1611-3349</idno>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt>
<idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Abstract: Extending a language by embedding within it another language presents significant parsing challenges, especially if the embedding is recursive. The composite grammar is likely to be nondeterministic as a result of tokens that are valid in both the host and the embedded language. In this paper we examine the challenges of embedding the Tom language into a variety of general-purpose high level languages. Tom provides syntax and semantics for advanced pattern matching and tree rewriting facilities. Embedded Tom constructs are translated into the host language by a preprocessor, the output of which is a composite program written purely in the host language. Tom implementations exist for Java, C, C#, Python and Caml. The current parser is complex and difficult to maintain. In this paper, we describe how Tom can be parsed using island grammars implemented with the Generalised LL (GLL) parsing algorithm. The grammar is, as might be expected, ambiguous. Extracting the correct derivation relies on our disambiguation strategy which is based on pattern matching within the parse forest. We describe different classes of ambiguity and propose patterns for resolving them.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>France</li>
<li>Pays-Bas</li>
<li>Royaume-Uni</li>
</country>
<region>
<li>Angleterre</li>
<li>Grand Est</li>
<li>Grand Londres</li>
<li>Lorraine (région)</li>
</region>
<settlement>
<li>Londres</li>
<li>Vandœuvre-lès-Nancy</li>
<li>Villers-lès-Nancy</li>
</settlement>
<orgName>
<li>Université de Londres</li>
<li>Université de Lorraine</li>
</orgName>
</list>
<tree>
<country name="Pays-Bas">
<noRegion>
<name sortKey="Afroozeh, Ali" sort="Afroozeh, Ali" uniqKey="Afroozeh A" first="Ali" last="Afroozeh">Ali Afroozeh</name>
</noRegion>
<name sortKey="Manders, Maarten" sort="Manders, Maarten" uniqKey="Manders M" first="Maarten" last="Manders">Maarten Manders</name>
<name sortKey="Manders, Maarten" sort="Manders, Maarten" uniqKey="Manders M" first="Maarten" last="Manders">Maarten Manders</name>
<name sortKey="Van Den Brand, Mark" sort="Van Den Brand, Mark" uniqKey="Van Den Brand M" first="Mark" last="Van Den Brand">Mark Van Den Brand</name>
<name sortKey="Van Den Brand, Mark" sort="Van Den Brand, Mark" uniqKey="Van Den Brand M" first="Mark" last="Van Den Brand">Mark Van Den Brand</name>
</country>
<country name="France">
<region name="Grand Est">
<name sortKey="Bach, Jean Christophe" sort="Bach, Jean Christophe" uniqKey="Bach J" first="Jean-Christophe" last="Bach">Jean-Christophe Bach</name>
</region>
<name sortKey="Bach, Jean Christophe" sort="Bach, Jean Christophe" uniqKey="Bach J" first="Jean-Christophe" last="Bach">Jean-Christophe Bach</name>
<name sortKey="Bach, Jean Christophe" sort="Bach, Jean Christophe" uniqKey="Bach J" first="Jean-Christophe" last="Bach">Jean-Christophe Bach</name>
<name sortKey="Bach, Jean Christophe" sort="Bach, Jean Christophe" uniqKey="Bach J" first="Jean-Christophe" last="Bach">Jean-Christophe Bach</name>
<name sortKey="Moreau, Pierre Etienne" sort="Moreau, Pierre Etienne" uniqKey="Moreau P" first="Pierre-Etienne" last="Moreau">Pierre-Etienne Moreau</name>
<name sortKey="Moreau, Pierre Etienne" sort="Moreau, Pierre Etienne" uniqKey="Moreau P" first="Pierre-Etienne" last="Moreau">Pierre-Etienne Moreau</name>
<name sortKey="Moreau, Pierre Etienne" sort="Moreau, Pierre Etienne" uniqKey="Moreau P" first="Pierre-Etienne" last="Moreau">Pierre-Etienne Moreau</name>
<name sortKey="Moreau, Pierre Etienne" sort="Moreau, Pierre Etienne" uniqKey="Moreau P" first="Pierre-Etienne" last="Moreau">Pierre-Etienne Moreau</name>
</country>
<country name="Royaume-Uni">
<region name="Angleterre">
<name sortKey="Johnstone, Adrian" sort="Johnstone, Adrian" uniqKey="Johnstone A" first="Adrian" last="Johnstone">Adrian Johnstone</name>
</region>
<name sortKey="Johnstone, Adrian" sort="Johnstone, Adrian" uniqKey="Johnstone A" first="Adrian" last="Johnstone">Adrian Johnstone</name>
<name sortKey="Scott, Elizabeth" sort="Scott, Elizabeth" uniqKey="Scott E" first="Elizabeth" last="Scott">Elizabeth Scott</name>
<name sortKey="Scott, Elizabeth" sort="Scott, Elizabeth" uniqKey="Scott E" first="Elizabeth" last="Scott">Elizabeth Scott</name>
</country>
</tree>
</affiliations>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Lorraine/explor/InforLorV4/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001561 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 001561 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Wicri/Lorraine
   |area=    InforLorV4
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     ISTEX:E0F72B61DB77182F93E495438C84CA6C8C51FC1C
   |texte=   Island Grammar-Based Parsing Using GLL and Tom
}}

This area was generated with Dilib version V0.6.33.
Data generation: Mon Jun 10 21:56:28 2019. Site generation: Fri Feb 25 15:29:27 2022